Virtual Modality: A Framework For Testing And Building Multimodal Applications

Authors

  • Peter Pal Boda
  • Edward Filisko
Abstract

This paper introduces a method that generates simulated multimodal input to be used in testing multimodal system implementations, as well as to build statistically motivated multimodal integration modules. The generation of such data is inspired by the fact that true multimodal data, recorded from real usage scenarios, is difficult and costly to obtain in large amounts. On the other hand, thanks to operational speech-only dialogue system applications, a wide selection of speech/text data (in the form of transcriptions, recognizer outputs, parse results, etc.) is available. Taking the textual transcriptions and converting them into multimodal inputs in order to assist multimodal system development is the underlying idea of the paper. A conceptual framework is established which utilizes two input channels: the original speech channel and an additional channel called Virtual Modality. This additional channel provides a certain level of abstraction to represent non-speech user inputs (e.g., gestures or sketches). From the transcriptions of the speech modality, pre-defined semantic items (e.g., nominal location references) are identified, removed, and replaced with deictic references (e.g., here, there). The deleted semantic items are then placed into the Virtual Modality channel and, according to external parameters (such as a pre-defined user population with various deviations), temporal shifts relative to the instant of each corresponding deictic reference are issued. The paper explains the procedure followed to create Virtual Modality data, the details of the speech-only database, and results based on a multimodal city information and navigation application.
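
As a rough illustration of the data-generation step described above, the following Python sketch replaces nominal location references in a transcription with a deictic word and emits corresponding Virtual Modality events with a sampled temporal shift. This is a minimal sketch under assumed conventions: the location list, function names, and the Gaussian offset model stand in for the paper's semantic annotations and external user-population parameters, and are not taken from the paper itself.

```python
import random
import re

# Hypothetical location inventory; in the paper the semantic items come from
# the annotations of the speech-only corpus, not from a hard-coded list.
LOCATIONS = ["Harvard Square", "the Museum of Fine Arts", "Central Square"]
LOCATION_RE = re.compile("|".join(re.escape(loc) for loc in LOCATIONS))


def make_virtual_modality(transcription, phrase_times, mean_shift=0.3, std_dev=0.4):
    """Replace nominal location references with deictics and emit Virtual
    Modality events carrying the removed semantic items.

    transcription : speech transcription from the unimodal corpus
    phrase_times  : time (in seconds) at which each location phrase is spoken,
                    standing in for a word-level alignment
    mean_shift, std_dev : assumed Gaussian parameters for the temporal shift of
                    the simulated gesture relative to the deictic reference
                    (a stand-in for the 'external parameters' describing a
                    simulated user population)
    """
    events = []

    def replace(match):
        item = match.group(0)
        # The simulated gesture is shifted relative to the instant of the deictic.
        shift = random.gauss(mean_shift, std_dev)
        events.append({"item": item, "time": phrase_times[item] + shift})
        return "there"  # deictic reference left behind in the speech channel

    speech = LOCATION_RE.sub(replace, transcription)
    return speech, events


# Example: one utterance from a city-navigation dialogue.
speech, vm_events = make_virtual_modality(
    "how do I get from Harvard Square to the Museum of Fine Arts",
    {"Harvard Square": 1.2, "the Museum of Fine Arts": 2.6},
)
print(speech)      # how do I get from there to there
print(vm_events)   # [{'item': 'Harvard Square', 'time': ...}, ...]
```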


Similar references

MMIR Framework: Multimodal Mobile Interaction and Rendering

In this article, we present the MMIR (Multimodal Mobile Interaction and Rendering) framework, which is targeted at building lightweight dialog systems. The framework is geared towards multimodal interaction development for applications where only limited resources are available, e.g., so that they can run self-contained even on mobile devices. The framework utilizes SCXML for modeling application-...


Natural Language Navigation Support in Virtual Reality

We describe our work on designing a natural language accessible navigation agent for a virtual reality (VR) environment. The agent is part of an agent framework, which means that it can communicate with other agents. Its navigation task consists of guiding visitors through the environment and answering questions about this environment (a theatre building). Visitors are invited to explore this bu...


A Toolkit for Creating and Testing Multimodal Interface Designs

Designing and implementing applications that can handle multiple recognition-based interaction technologies such as speech and gesture inputs is a difficult task. IMBuilder and MEngine are the two components of a new toolkit for rapidly creating and testing multimodal interface designs. First, an interaction model is specified in the form of a collection of finite state machines, using a simple...
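
To make the finite-state idea concrete, here is a toy sketch of such an interaction model in Python; the states, event names, and the "move object" command are invented for illustration and are not taken from IMBuilder or MEngine.

```python
# Toy finite state machine for a hypothetical "move object" multimodal command:
# the command is complete once a spoken verb and a pointing gesture have both
# been received, in either order.
TRANSITIONS = {
    ("start", "speech:move"): "need_target",
    ("start", "gesture:point"): "need_verb",
    ("need_target", "gesture:point"): "complete",
    ("need_verb", "speech:move"): "complete",
}


def run(events, state="start"):
    """Advance the interaction model over a sequence of input events."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)  # ignore unexpected input
    return state


print(run(["speech:move", "gesture:point"]))   # complete
print(run(["gesture:point", "speech:move"]))   # complete
```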


A Haptic-Enabled Multimodal Interface and Framework for Preoperative Planning of Total Hip Arthroplasty

Multimodal environments seek to create computational scenarios that fuse sensory data (sight, sound, touch, and perhaps smell) to form an advanced, realistic, and intuitive user interface. This can be particularly compelling in medical applications, where surgeons use a range of sensory motor cues [1-4]. Sample applications include simulators, education and training, surgical planning, and scientif...


Multimodal fusion system for NDT and Metrology

3D scanning and modeling are used in a wide range of applications, including manufacturing, aerospace, security, and biomedicine. Such systems scan a 3D point cloud that can be meshed and rendered in order to obtain a complete 3D model. However, 3D vision cannot detect subsurface defects. The latter can be achieved using other imaging modalities, like thermal infrared imagin...


Journal title:

Volume   Issue

Pages  -

Publication date: 2004